26 research outputs found
Machine Learning of Molecular Electronic Properties in Chemical Compound Space
The combination of modern scientific computing with electronic structure
theory can lead to an unprecedented amount of data amenable to intelligent data
analysis for the identification of meaningful, novel, and predictive
structure-property relationships. Such relationships enable high-throughput
screening for relevant properties in an exponentially growing pool of virtual
compounds that are synthetically accessible. Here, we present a machine
learning (ML) model, trained on a database of ab initio calculation
results for thousands of organic molecules, that simultaneously predicts
multiple electronic ground- and excited-state properties. The properties
include atomization energy, polarizability, frontier orbital eigenvalues,
ionization potential, electron affinity, and excitation energies. The ML model
is based on a deep multi-task artificial neural network, exploiting underlying
correlations between various molecular properties. The input is identical to
ab initio methods, i.e., nuclear charges and Cartesian coordinates
of all atoms. For small organic molecules the accuracy of such a "Quantum
Machine" is similar, and sometimes superior, to modern quantum-chemical
methods, at negligible computational cost
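The abstract says only that the model's input is the set of nuclear charges and Cartesian coordinates; it does not name the descriptor built from them. As a minimal sketch, assuming the widely used Coulomb-matrix encoding (an assumption, not something stated in the abstract), such an input could be prepared like this. The function name and the H2 toy geometry are hypothetical.

```python
import math

def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix from nuclear charges and Cartesian coordinates.

    Diagonal: 0.5 * Z_i ** 2.4; off-diagonal: Z_i * Z_j / |R_i - R_j|.
    Coordinates are assumed to be in bohr (atomic units).
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix

# Hypothetical toy input: H2 with a 1.4 bohr bond length.
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
```

The resulting matrix is symmetric and unchanged by translating or rotating the molecule, which is one reason a descriptor of this kind is a plausible choice for feeding a neural network.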
Tunable Semiconductors: Control over Carrier States and Excitations in Layered Hybrid Organic-Inorganic Perovskites
For a class of 2D hybrid organic-inorganic perovskite semiconductors based on
π-conjugated organic cations, we predict quantitatively how varying the
organic and inorganic component allows control over the nature, energy and
localization of carrier states in a quantum-well-like fashion. Our
first-principles predictions, based on large-scale hybrid density-functional
theory with spin-orbit coupling, show that the interface between the organic
and inorganic parts within a single hybrid can be modulated systematically,
enabling us to select between different type-I and type-II energy level
alignments. Energy levels, recombination properties and transport behavior of
electrons and holes thus become tunable by choosing specific organic
functionalizations and juxtaposing them with suitable inorganic components
QM7-X: A comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules
We introduce QM7-X, a comprehensive dataset of 42 physicochemical properties
for 4.2 M equilibrium and non-equilibrium structures of small organic
molecules with up to seven non-hydrogen (C, N, O, S, Cl) atoms. To span this
fundamentally important region of chemical compound space (CCS), QM7-X includes
an exhaustive sampling of (meta-)stable equilibrium structures - comprising
constitutional/structural isomers and stereoisomers, e.g., enantiomers and
diastereomers (including cis-/trans- and conformational isomers) - as well as
100 non-equilibrium structural variations thereof to reach a total of
4.2 M molecular structures. Computed at the tightly converged
quantum-mechanical PBE0+MBD level of theory, QM7-X contains global (molecular)
and local (atom-in-a-molecule) properties ranging from ground state quantities
(such as atomization energies and dipole moments) to response quantities (such
as polarizability tensors and dispersion coefficients). By providing a
systematic, extensive, and tightly-converged dataset of quantum-mechanically
computed physicochemical properties, we expect that QM7-X will play a critical
role in the development of next-generation machine-learning based models for
exploring greater swaths of CCS and performing in silico design of molecules
with targeted properties
Genarris: Random Generation of Molecular Crystal Structures and Fast Screening with a Harris Approximation
We present Genarris, a Python package that performs configuration space
screening for molecular crystals of rigid molecules by random sampling with
physical constraints. For fast energy evaluations Genarris employs a Harris
approximation, whereby the total density of a molecular crystal is constructed
via superposition of single molecule densities. Dispersion-inclusive density
functional theory (DFT) is then used for the Harris density without performing
a self-consistency cycle. Genarris uses machine learning for clustering, based
on a relative coordinate descriptor (RCD) developed specifically for molecular
crystals, which is shown to be robust in identifying packing motif similarity.
In addition to random structure generation, Genarris offers three workflows
based on different sequences of successive clustering and selection steps: the
"Rigorous" workflow is an exhaustive exploration of the potential energy
landscape, the "Energy" workflow produces a set of low energy structures, and
the "Diverse" workflow produces a maximally diverse set of structures. The
latter is recommended for generating initial populations for genetic
algorithms. Here, the implementation of Genarris is reported and its
application is demonstrated for three test cases
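The Harris approximation described above builds the crystal density by superposing single-molecule densities. A minimal one-dimensional sketch of that superposition step follows; the Gaussian model density, the grid, and all function and parameter names are illustrative assumptions and are not part of the Genarris API.

```python
import math

def gaussian_density(x, center, electrons, width):
    """Model single-molecule density: a 1D Gaussian that integrates to `electrons`."""
    norm = electrons / (width * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((x - center) / width) ** 2)

def superposed_density(grid, centers, electrons=10.0, width=0.5):
    """Sum the isolated-molecule densities at every grid point.

    No self-consistency cycle is run: the total density is simply the
    superposition of the fixed single-molecule densities.
    """
    return [
        sum(gaussian_density(x, c, electrons, width) for c in centers)
        for x in grid
    ]

# Hypothetical toy "crystal": three identical molecules on a line.
step = 0.01
grid = [-10.0 + step * i for i in range(2001)]
centers = [-2.0, 0.0, 2.0]
rho = superposed_density(grid, centers)

# Integrating the superposed density recovers the total electron count.
total_electrons = sum(rho) * step
```

In this toy setting the integrated density is three times the per-molecule electron count, which is the property a superposition of fixed densities has to preserve.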
The Structure of Liquid and Amorphous Hafnia.
Understanding the atomic structure of amorphous solids is important in predicting and tuning their macroscopic behavior. Here, we use a combination of high-energy X-ray diffraction, neutron diffraction, and molecular dynamics simulations to benchmark the atomic interactions in the high-temperature stable liquid and low-density amorphous solid states of hafnia. The diffraction results reveal that an average Hf-O coordination number of ~7 exists in both the liquid and amorphous nanoparticle forms studied. The measured pair distribution functions are compared to those generated from several simulation models in the literature. We have also performed ab initio and classical molecular dynamics simulations that show that density has a strong effect on the polyhedral connectivity. The liquid shows a broad distribution of Hf-Hf interactions, while the formation of low-density amorphous nanoclusters can reproduce the sharp split peak in the Hf-Hf partial pair distribution function observed in experiment. The agglomeration of amorphous nanoparticles condensed from the gas phase is associated with the formation of both edge-sharing and corner-sharing HfO6,7 polyhedra resembling those observed in the monoclinic phase
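An average Hf-O coordination number like the one quoted above can be estimated from a simulation snapshot by counting O atoms within a cutoff distance of each Hf atom. The sketch below assumes a hypothetical cutoff of 2.6 angstrom and a made-up toy cluster, and it ignores periodic images; in practice the cutoff would typically be taken from the first minimum of the Hf-O pair distribution function.

```python
import math

def mean_coordination(hf_sites, o_sites, cutoff):
    """Average number of O atoms within `cutoff` of each Hf atom.

    Periodic boundary conditions are not applied in this sketch.
    """
    counts = [
        sum(1 for o in o_sites if math.dist(hf, o) <= cutoff)
        for hf in hf_sites
    ]
    return sum(counts) / len(counts)

# Hypothetical toy cluster (coordinates in angstrom), not real hafnia data.
hf_sites = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
o_sites = [
    (2.0, 0.0, 0.0),   # bridges both Hf atoms
    (0.0, 2.1, 0.0),   # bonded to the first Hf only
    (0.0, 0.0, -2.2),  # bonded to the first Hf only
    (4.0, 2.0, 0.0),   # bonded to the second Hf only
    (9.0, 9.0, 9.0),   # too far from either Hf
]
cn = mean_coordination(hf_sites, o_sites, cutoff=2.6)
```

Here the first Hf has three neighbours and the second has two, so the toy cluster averages 2.5; the same counting applied to a real liquid or amorphous snapshot is what would be compared against the diffraction value.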
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation
MADNESS (multiresolution adaptive numerical environment for scientific
simulation) is a high-level software environment for solving integral and
differential equations in many dimensions that uses adaptive and fast harmonic
analysis methods with guaranteed precision based on multiresolution analysis
and separated representations. Underpinning the numerical capabilities is a
powerful petascale parallel programming environment that aims to increase both
programmer productivity and code scalability. This paper describes the features
and capabilities of MADNESS and briefly discusses some current applications in
chemistry and several areas of physics
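The idea of adaptive multiresolution refinement can be illustrated with a toy one-dimensional example: a box is split only where the difference between its two half-box averages (a Haar-type detail coefficient) exceeds a tolerance. This is a simplified stand-in written for illustration, not MADNESS code and not its API, and it carries no rigorous precision guarantee; all names and the test function are assumptions.

```python
import math

def box_mean(f, lo, hi, samples=16):
    """Approximate the mean of f on [lo, hi] with the midpoint rule."""
    width = (hi - lo) / samples
    return sum(f(lo + (k + 0.5) * width) for k in range(samples)) / samples

def adaptive_boxes(f, lo, hi, tol, max_depth=20, depth=0):
    """Return a list of (lo, hi, mean) leaf boxes covering [lo, hi].

    A box is split in half while the Haar-type detail between its two
    half-box means is larger than `tol` and `max_depth` is not reached.
    """
    mid = 0.5 * (lo + hi)
    left = box_mean(f, lo, mid)
    right = box_mean(f, mid, hi)
    detail = 0.5 * (left - right)
    if abs(detail) <= tol or depth >= max_depth:
        return [(lo, hi, 0.5 * (left + right))]
    return (adaptive_boxes(f, lo, mid, tol, max_depth, depth + 1)
            + adaptive_boxes(f, mid, hi, tol, max_depth, depth + 1))

def sharp_peak(x):
    """Hypothetical test function: a narrow Gaussian centred at x = 0.3."""
    return math.exp(-200.0 * (x - 0.3) ** 2)

leaves = adaptive_boxes(sharp_peak, 0.0, 1.0, tol=1e-3)
widths = [hi - lo for lo, hi, _ in leaves]
```

The leaves end up fine near the peak and coarse where the function is nearly flat. One caveat of this simple criterion is that a box whose two halves happen to have equal means is not split even if the function varies inside it.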
Report on the sixth blind test of organic crystal structure prediction methods.
The sixth blind test of organic crystal structure prediction (CSP) methods has been held, with five target systems: a small nearly rigid molecule, a polymorphic former drug candidate, a chloride salt hydrate, a co-crystal and a bulky flexible molecule. This blind test has seen substantial growth in the number of participants, with the broad range of prediction methods giving a unique insight into the state of the art in the field. Significant progress has been seen in treating flexible molecules, usage of hierarchical approaches to ranking structures, the application of density-functional approximations, and the establishment of new workflows and "best practices" for performing CSP calculations. All of the targets, apart from a single potentially disordered Z' = 2 polymorph of the drug candidate, were predicted by at least one submission. Despite many remaining challenges, it is clear that CSP methods are becoming more applicable to a wider range of real systems, including salts, hydrates and larger flexible molecules. The results also highlight the potential for CSP calculations to complement and augment experimental studies of organic solid forms.
The organisers and participants are very grateful to the crystallographers who supplied the candidate structures: Dr. Peter Horton (XXII), Dr. Brian Samas (XXIII), Prof. Bruce Foxman (XXIV), and Prof. Kraig Wheeler (XXV and XXVI). We are also grateful to Dr. Emma Sharp and colleagues at Johnson Matthey (Pharmorphix) for the polymorph screening of XXVI, as well as numerous colleagues at the CCDC for assistance in organising the blind test. Submission 2: We acknowledge Dr. Oliver Korb for numerous useful discussions. Submission 3: The Day group acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work.
We acknowledge funding from the EPSRC (grants EP/J01110X/1 and EP/K018132/1) and the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC through grant agreements n. 307358 (ERC-stG-2012-ANGLE) and n. 321156 (ERC-AG-PE5-ROBOT). Submission 4: I am grateful to Mikhail Kuzminskii for calculations of molecular structures with the Gaussian 98 program at the Institute of Organic Chemistry RAS. The Russian Foundation for Basic Research is acknowledged for financial support (14-03-01091). Submission 5: Toine Schreurs provided computer facilities and assistance. I am grateful to Matthew Habgood at AWE company for providing a travel grant. Submission 6: We would like to acknowledge support of this work by GlaxoSmithKline, Merck, and Vertex. Submission 7: The research was financially supported by the VIDI Research Program 700.10.427, which is financed by The Netherlands Organisation for Scientific Research (NWO), and the European Research Council (ERC-2010-StG, grant agreement n. 259510-KISMOL). We acknowledge the support of the Foundation for Fundamental Research on Matter (FOM). Supercomputer facilities were provided by the National Computing Facilities Foundation (NCF). Submission 8: Computer resources were provided by the Center for High Performance Computing at the University of Utah and the Extreme Science and Engineering Discovery Environment (XSEDE), supported by NSF grant number ACI-1053575. MBF and GIP acknowledge the support from the University of Buenos Aires and the Argentinian Research Council. Submission 9: We thank Dr. Bouke van Eijck for his valuable advice on our predicted structure of XXV. We thank the promotion office for TUT programs on advanced simulation engineering (ADSIM), the leading program for training brain information architects (BRAIN), and the information and media center (IMC) at Toyohashi University of Technology for the use of the TUT supercomputer systems and application software.
We also thank the ACCMS at Kyoto University for the use of their supercomputer. In addition, we wish to thank financial supports from Conflex Corp. and Ministry of Education, Culture, Sports, Science and Technology. Submission 12: We thank Leslie Leiserowitz from the Weizmann Institute of Science and Geoffrey Hutchinson from the University of Pittsburgh for helpful discussions. We thank Adam Scovel at the Argonne Leadership Computing Facility (ALCF) for technical support. Work at Tulane University was funded by the Louisiana Board of Regents Award # LEQSF(2014-17)-RD-A-10 “Toward Crystal Engineering from First Principles”, by the NSF award # EPS-1003897 “The Louisiana Alliance for Simulation-Guided Materials Applications (LA-SiGMA)”, and by the Tulane Committee on Research Summer Fellowship. Work at the Technical University of Munich was supported by the Solar Technologies Go Hybrid initiative of the State of Bavaria, Germany. Computer time was provided by the Argonne Leadership Computing Facility (ALCF), which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357. Submission 13: This work would not have been possible without funding from Khalifa University’s College of Engineering. I would like to acknowledge Prof. Robert Bennell and Prof. Bayan Sharif for supporting me in acquiring the resources needed to carry out this research. Dr. Louise Price is thanked for her guidance on the use of DMACRYS and NEIGHCRYS during the course of this research. She is also thanked for useful discussions and numerous e-mail exchanges concerning the blind test. Prof. Sarah Price is acknowledged for her support and guidance over many years and for providing access to DMACRYS and NEIGHCRYS. 
Submission 15: The work was supported by the United Kingdom’s Engineering and Physical Sciences Research Council (EPSRC) (EP/J003840/1, EP/J014958/1) and was made possible through access to computational resources and support from the High Performance Computing Cluster at Imperial College London. We are grateful to Professor Sarah L. Price for supplying the DMACRYS code for use within CrystalOptimizer, and to her and her research group for support with DMACRYS and feedback on CrystalPredictor and CrystalOptimizer. Submission 16: R. J. N. acknowledges financial support from the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. [EP/J017639/1]. R. J. N. and C. J. P. acknowledge use of the Archer facilities of the U.K.’s national high-performance computing service (for which access was obtained via the UKCP consortium [EP/K014560/1]). C. J. P. also acknowledges a Leadership Fellowship Grant [EP/K013688/1]. B. M. acknowledges Robinson College, Cambridge, and the Cambridge Philosophical Society for a Henslow Research Fellowship. Submission 17: The work at the University of Delaware was supported by the Army Research Office under Grant W911NF-13-1-0387 and by the National Science Foundation Grant CHE-1152899. The work at the University of Silesia was supported by the Polish National Science Centre Grant No. DEC-2012/05/B/ST4/00086. Submission 18: We would like to thank Constantinos Pantelides, Claire Adjiman and Isaac Sugden of Imperial College for their support of our use of CrystalPredictor and CrystalOptimizer in this and Submission 19. The CSP work of the group is supported by EPSRC, through grant EP/K039229/1, and Eli Lilly. The PhD students are supported as follows: RKH by a joint UCL Max-Planck Society Magdeburg Impact studentship; REW by a UCL Impact studentship; LI by the Cambridge Crystallographic Data Centre and the M3S Centre for Doctoral Training (EPSRC EP/G036675/1).
Submission 19: The potential generation work at the University of Delaware was supported by the Army Research Office under Grant W911NF-13-1-0387 and by the National Science Foundation Grant CHE-1152899. Submission 20: The work at New York University was supported, in part, by the U.S. Army Research Laboratory and the U.S. Army Research Office under contract/grant number W911NF-13-1-0387 (MET and LV) and, in part, by the Materials Research Science and Engineering Center (MRSEC) program of the National Science Foundation under Award Number DMR-1420073 (MET and ES). The work at the University of Delaware was supported by the U.S. Army Research Laboratory and the U.S. Army Research Office under contract/grant number W911NF-13-1-0387 and by the National Science Foundation Grant CHE-1152899. Submission 21: We thank the National Science Foundation (DMR-1231586), the Government of Russian Federation (Grant No. 14.A12.31.0003), the Foreign Talents Introduction and Academic Exchange Program (No. B08040) and the Russian Science Foundation, project no. 14-43-00052, base organization Photochemistry Center of the Russian Academy of Sciences. Calculations were performed on the Rurik supercomputer at Moscow Institute of Physics and Technology. Submission 22: The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC). Submission 24: The potential generation work at the University of Delaware was supported by the Army Research Office under Grant W911NF-13-1-0387 and by the National Science Foundation Grant CHE-1152899. Submission 25: J.H. and A.T. acknowledge the support from the Deutsche Forschungsgemeinschaft under the program DFG-SPP 1807. H-Y.K., R.A.D., and R.C. acknowledge support from the Department of Energy (DOE) under Grant No. DE-SC0008626. This research used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S.
Department of Energy under Contract No. DE-AC02-06CH11357. This research used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Additional computational resources were provided by the Terascale Infrastructure for Groundbreaking Research in Science and Engineering (TIGRESS) High Performance Computing Center and Visualization Laboratory at Princeton University. This is the final version of the article. It first appeared from Wiley via http://dx.doi.org/10.1107/S2052520616007447
Report on the sixth blind test of organic crystal-structure prediction methods
The sixth blind test of organic crystal-structure prediction (CSP) methods has been held, with five target systems: a small nearly rigid molecule, a polymorphic former drug candidate, a chloride salt hydrate, a co-crystal, and a bulky flexible molecule. This blind test has seen substantial growth in the number of submissions, with the broad range of prediction methods giving a unique insight into the state of the art in the field. Significant progress has been seen in treating flexible molecules, usage of hierarchical approaches to ranking structures, the application of density-functional approximations, and the establishment of new workflows and "best practices" for performing CSP calculations. All of the targets, apart from a single potentially disordered Z' = 2 polymorph of the drug candidate, were predicted by at least one submission. Despite many remaining challenges, it is clear that CSP methods are becoming more applicable to a wider range of real systems, including salts, hydrates and larger flexible molecules. The results also highlight the potential for CSP calculations to complement and augment experimental studies of organic solid forms